Belief-based nonlinear rescoring in Thai speech understanding

Authors

  • Chai Wutiwiwatchai
  • Sadaoki Furui
Abstract

This paper proposes an approach to improving speech understanding by rescoring N-best semantic hypotheses. In rescoring, probabilities produced by an understanding component are combined with additional probabilities derived from system beliefs. While the usual rescoring approach is to multiply or linearly interpolate with belief probabilities, this paper shows that probabilities from various sources are better combined using a nonlinear estimator. Using the proposed model together with a dialogue-state-dependent semantic model yields a significant improvement when applied to a Thai interactive hotel reservation agent (TIRA), the first spoken dialogue system in the Thai language.
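
The contrast between the combination rules named in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' model: the abstract does not say which nonlinear estimator is used, so a tiny one-hidden-layer network stands in for it here. The hypothesis labels, the probabilities, the interpolation weight `lam`, and the `HIDDEN`/`OUT` weights are all hypothetical; in practice such weights would be trained on data.

```python
import math

def product_score(p_und, p_bel):
    """Multiply the understanding and belief probabilities."""
    return p_und * p_bel

def linear_score(p_und, p_bel, lam=0.5):
    """Linearly interpolate the two probabilities (lam is an assumed weight)."""
    return lam * p_und + (1.0 - lam) * p_bel

def nonlinear_score(p_und, p_bel, hidden, out):
    """Combine the probabilities with a small one-hidden-layer network.

    hidden: list of (w_und, w_bel, bias) tuples, one per hidden unit
    out:    (list of output weights, output bias)
    This stands in for "a nonlinear estimator"; it is an assumption,
    not the estimator described in the paper.
    """
    h = [math.tanh(wu * p_und + wb * p_bel + b) for wu, wb, b in hidden]
    weights, bias = out
    z = sum(w * hi for w, hi in zip(weights, h)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # squash to (0, 1)

def rescore(nbest, scorer):
    """Return the N-best list sorted by the combined score, best first."""
    return sorted(nbest, key=lambda h: scorer(h["p_und"], h["p_bel"]), reverse=True)

# Hypothetical N-best semantic hypotheses with made-up probabilities.
nbest = [
    {"sem": "reserve_room", "p_und": 0.50, "p_bel": 0.20},
    {"sem": "ask_price", "p_und": 0.40, "p_bel": 0.70},
    {"sem": "cancel", "p_und": 0.10, "p_bel": 0.10},
]

# Hypothetical fixed weights; a real system would learn these.
HIDDEN = [(2.0, 1.0, -1.0), (-1.0, 3.0, -0.5)]
OUT = ([1.5, 2.0], -0.5)

def nonlinear(p_und, p_bel):
    return nonlinear_score(p_und, p_bel, HIDDEN, OUT)

for name, scorer in [("product", product_score),
                     ("linear", linear_score),
                     ("nonlinear", nonlinear)]:
    print(name, [h["sem"] for h in rescore(nbest, scorer)])
```

With a trained estimator, the ranking can differ from the product or interpolation ranking because the network is not restricted to a fixed functional form over the two inputs.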

Similar resources

Speech recognition using non-linear trajectories in a formant-based articulatory layer of a multiple-level segmental HMM

This paper describes how non-linear formant trajectories, based on the ‘trajectory HMM’ proposed by Tokuda et al., can be exploited under the framework of multiple-level segmental HMMs. In the resultant model, named a non-linear/linear multiple-level segmental HMM, speech dynamics are modeled as non-linear smooth trajectories in the formant-based intermediate layer. These formant trajectories are m...

Fuzzy class rescoring: a part-of-speech language model

Current speech recognition systems usually use word-based trigram language models. More elaborate models are applied to word lattices or N-best lists in a rescoring pass following the acoustic decoding process. In this paper we consider techniques for dealing with class-based language models in the lattice rescoring framework of our JANUS large vocabulary speech recognizer. We demonstrate how t...

Towards an improved model of dynamics for speech recognition and synthesis

This thesis describes the research on the use of non-linear formant trajectories to model speech dynamics under the framework of a multiple-level segmental hidden Markov model (MSHMM). The particular type of intermediate-layer model investigated in this study is based on the 12-dimensional parallel formant synthesiser (PFS) control parameters, which can be directly used to synthesise speech wit...

Rescoring under fuzzy measures with a multilayer neural network in a rule-based speech recognition system

In this paper, a speech rescoring system is developed on a set of phonetic hypotheses produced by a bottom-up knowledge-based decoder. An original method to automatically compute a fuzzy membership function from top-down acoustic rules statistics is compared with a possibilistic measure. To aggregate the fuzzy degrees into a phonetic score, a multilayer neural network is trained on the results o...

Task Dependent Loss Functions in Speech Recognition: Search over Recognition Lattices

A recognition strategy that can be matched to specific system performance criteria such as word error rate or F-measure has recently been found to yield improvements over the usual maximum a-posteriori probability strategy [1] [2] [3]. In this matched-to-the-task strategy a hypothesis is chosen to minimize the expected loss or the Bayes Risk under a loss function defined by a performance measur...

Publication date: 2004